Checking the version of Python used in this notebook
import sys
assert sys.version_info >= (3, 7)
Kaggle provides the following description of each of the biases
Findings about the biases in data - Kaggle
In this code we look at how biases may be carried over from the data into the model and its outputs. For example, in one part of the tutorial, we were looking at how historical bias has come in and altered the results in an inaccurate way.
from IPython.display import Image
Image(filename="Kaggle_code_1.png", width=600, height=300)
Image(filename="Kaggle_code_2.png", width=400, height=150)
Image(filename="Kaggle_code_3.png", width=600, height=300)
Image(filename="Kaggle_code_4.png", width=400, height=150)
The code above shows how the model flags comments as toxic even when they are not, based on the identity terms they contain. This reveals a bias in favour of white over black in the model's behaviour. This may be an example of historical bias carried in from the data the model was trained on.
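To make the idea concrete, here is a minimal sketch of how such a bias can show up, assuming a toy bag-of-words scorer with invented weights (these weights and the threshold are hypothetical, not from the Kaggle tutorial's actual model):

```python
# Hypothetical toy "toxicity" scorer whose learned weights encode a
# historical bias: an identity term has picked up a toxic weight.
BIASED_WEIGHTS = {
    "hate": 0.9, "stupid": 0.8,  # genuinely toxic words
    "black": 0.6,                # identity term wrongly weighted as toxic
    "white": 0.0,
}

def toxicity_score(comment):
    """Sum the learned weight of each known word in the comment."""
    return sum(BIASED_WEIGHTS.get(w, 0.0) for w in comment.lower().split())

def is_toxic(comment, threshold=0.5):
    return toxicity_score(comment) > threshold

# Two comments that differ only in the identity term mentioned:
print(is_toxic("I have a white friend"))  # False
print(is_toxic("I have a black friend"))  # True - flagged despite being benign
```

The two test comments are identical apart from one word, yet only one is flagged - the same pattern the Kaggle screenshots above demonstrate.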
I first begin by ensuring Word2Vec 10k is selected as my tensor. Typing in words like Apple, Silver, and Sound, I am able to see the cluster data and where these words are located within the three-dimensional projection. After typing in a word, I am able to isolate points - this removes clutter and presents only the most closely related words. Words with similar meanings tend to appear closer together; this can be seen with pairs like phone and computer, or cat and dog.
Gender bias is an issue that coders must consider - some words automatically become associated with certain genders. For example, among professions, law is typically male-dominated, while teaching is predominantly female.
Image(filename="Embedding_Projector_Man.png", width=600, height=300)
Image(filename="Embedding_Projector_Lawyer.png", width=600, height=300)
Image(filename="Embedding_Projector_Woman.png", width=600, height=300)
Image(filename="Embedding_Projector_Teacher.png", width=600, height=300)
From these select words, using this program, you can see that some gender bias is notable - however, whether it is avoidable is questionable. I believe an effort has been made by the programmers to reduce the issue of gender bias. Even though teaching is a profession usually associated with being female, the word "father" appears just below "teacher", and the word "man" is not far away either.
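The projector's "nearest points" view is essentially a cosine-similarity comparison between embedding vectors. A minimal sketch of that comparison, assuming invented 3-dimensional toy vectors (the real Word2Vec 10k embeddings are much higher-dimensional, and these numbers are made up purely to illustrate the bias pattern):

```python
import math

# Toy vectors invented for illustration - not real Word2Vec 10k values.
vectors = {
    "man":     [0.9, 0.1, 0.2],
    "woman":   [0.1, 0.9, 0.2],
    "lawyer":  [0.7, 0.3, 0.6],
    "teacher": [0.3, 0.7, 0.6],
}

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Which gendered word is each profession's vector closer to?
for profession in ("lawyer", "teacher"):
    to_man = cosine(vectors[profession], vectors["man"])
    to_woman = cosine(vectors[profession], vectors["woman"])
    closer = "man" if to_man > to_woman else "woman"
    print(f"{profession}: closer to '{closer}'")
```

With these toy vectors, "lawyer" lands nearer "man" and "teacher" nearer "woman" - the same asymmetry visible in the projector screenshots above.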
Assessing the fairness of AI before it's deployed to the real world.
Kaggle provides the following description of each of these criteria
Image(filename="Kaggle_Fairness_1.png", width=600, height=300)
Image(filename="Kaggle_Fairness_2.png", width=400, height=150)
Image(filename="Kaggle_Fairness_3.png", width=600, height=300)
Image(filename="Kaggle_Fairness_4.png", width=500, height=200)
Image(filename="Kaggle_Fairness_5.png", width=750, height=300)
Image(filename="Kaggle_Fairness_6.png", width=800, height=300)
Image(filename="Kaggle_Fairness_7.png", width=600, height=300)
Image(filename="Kaggle_Fairness_8.png", width=500, height=200)
Image(filename="Kaggle_Fairness_9.png", width=800, height=400)
Image(filename="Kaggle_Fairness_10.png", width=500, height=200)
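Two of the fairness criteria the lesson covers - demographic parity (equal approval rates across groups) and equal opportunity (equal true-positive rates among the genuinely qualified) - can be sketched as simple group-wise rates. The records below are invented, and "A"/"B" stand in for any sensitive attribute; this is not the Kaggle exercise's dataset:

```python
# Each record: (group, actually_qualified, model_approved) - invented data.
records = [
    ("A", True,  True), ("A", True,  True),  ("A", False, True),  ("A", False, False),
    ("B", True,  True), ("B", True,  False), ("B", False, False), ("B", False, False),
]

def approval_rate(group):
    """Demographic parity compares this rate across groups."""
    rows = [r for r in records if r[0] == group]
    return sum(r[2] for r in rows) / len(rows)

def true_positive_rate(group):
    """Equal opportunity compares approval among the qualified only."""
    qualified = [r for r in records if r[0] == group and r[1]]
    return sum(r[2] for r in qualified) / len(qualified)

for g in ("A", "B"):
    print(f"group {g}: approval={approval_rate(g):.2f}, TPR={true_positive_rate(g):.2f}")
```

Here group A is approved more often overall and its qualified members are approved more often too, so this toy model fails both criteria at once - in general the criteria can disagree, which is why the lesson presents several of them.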
Calculating permutation importance, using Kaggle.
Permutation importance is a helpful tool for detecting the features that have the biggest impact on predictions. Kaggle states that permutation importance is:
Image(filename="Kaggle_Permutation_Importance_1.png", width=500, height=200)
Image(filename="Kaggle_Permutation_Importance_2.png", width=800, height=300)
Image(filename="Kaggle_Permutation_Importance_5.png", width=500, height=200)
Image(filename="Kaggle_Permutation_Importance_6.png", width=300, height=200)
Image(filename="Kaggle_Permutation_Importance_7.png", width=650, height=200)
Image(filename="Kaggle_Permutation_Importance_8.png", width=300, height=200)
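The idea behind the screenshots above can be sketched from scratch: shuffle one feature's column, and the drop in accuracy measures how much the model relied on that feature. This is a minimal sketch with an invented "model" and random data, not the Kaggle exercise's dataset or library:

```python
import random

random.seed(0)

# Toy data: the label equals feature 0; feature 1 is pure noise.
rows = [(random.randint(0, 1), random.randint(0, 1)) for _ in range(200)]
labels = [r[0] for r in rows]

def model_predict(row):
    return row[0]  # a "model" that only ever uses feature 0

def accuracy(data):
    return sum(model_predict(r) == y for r, y in zip(data, labels)) / len(data)

baseline = accuracy(rows)  # 1.0 by construction

for feature in (0, 1):
    # Shuffle one column while leaving the other, and the labels, intact.
    shuffled_col = [r[feature] for r in rows]
    random.shuffle(shuffled_col)
    permuted = [
        (v, r[1]) if feature == 0 else (r[0], v)
        for r, v in zip(rows, shuffled_col)
    ]
    print(f"feature {feature}: importance = {baseline - accuracy(permuted):.2f}")
```

Shuffling feature 0 destroys the model's only signal, so its importance is large, while shuffling the noise feature changes nothing and its importance is exactly 0 - matching how the Kaggle output ranks features by the accuracy drop they cause.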